
    Coordinated inductive learning using argumentation-based communication

    This paper focuses on coordinated inductive learning, that is, how agents with inductive learning capabilities can coordinate the hypotheses they learn with other agents. Coordination in this context means that the hypothesis learnt by one agent is consistent with the data known to the other agents. To address this problem, we present A-MAIL, an argumentation approach for agents to argue about hypotheses learnt by induction. A-MAIL integrates, in a single framework, the capabilities of learning from experience, communication, hypothesis revision and argumentation. The A-MAIL approach is therefore a step further towards autonomous agents with learning capabilities that can use, communicate and reason about the knowledge they learn from examples. © 2014, The Author(s). Research partially funded by the projects Next-CBR (TIN2009-13692-C03-01) and Cognitio (TIN2012-38450-C03-03) [both co-funded with FEDER], Agreement Technologies (CONSOLIDER CSD2007-0022), and by the Grants 2009-SGR-1433 and 2009-SGR-1434 of the Generalitat de Catalunya. Peer reviewed.
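
    To make the coordination loop described above concrete, the sketch below is a heavily simplified illustration and not the actual A-MAIL protocol or its hypothesis language: one agent induces a hypothesis from its own examples, another agent attacks it with counterexamples drawn from data the first agent cannot see, and the proposer revises until the hypothesis is consistent with both agents' examples. The threshold-rule representation and the example data are invented for illustration.

        # Toy coordination loop: propose a hypothesis, receive counterexamples, revise.
        # The hypothesis language (a single threshold rule) is a simplification;
        # A-MAIL itself argues over hypotheses induced from relational data.
        def induce(examples):
            """Learn a threshold t such that 'positive iff value >= t' fits the examples."""
            pos = [v for v, label in examples if label]
            neg = [v for v, label in examples if not label]
            return (max(neg) + min(pos)) / 2  # midpoint between the two classes

        def counterexamples(t, examples):
            """Examples that the hypothesis misclassifies."""
            return [(v, label) for v, label in examples if (v >= t) != label]

        agent_a = [(8.0, True), (9.5, True), (3.0, False)]   # A's own experience
        agent_b = [(6.0, True), (4.5, False), (5.5, False)]  # B's experience, unknown to A

        hypothesis = induce(agent_a)
        print("A's initial hypothesis: positive iff value >=", hypothesis)

        # Argumentation rounds: B attacks the hypothesis, A revises it.
        attack = counterexamples(hypothesis, agent_b)
        while attack:
            agent_a += attack
            hypothesis = induce(agent_a)
            attack = counterexamples(hypothesis, agent_b)
        print("coordinated hypothesis: positive iff value >=", hypothesis)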

    Similarity measures over refinement graphs

    Similarity assessment plays a key role in lazy learning methods such as k-nearest neighbor or case-based reasoning, and similarity also plays a crucial role in support vector machines. In this paper we will show how refinement graphs, which were originally introduced for inductive learning, can be employed to assess and reason about similarity. We will define and analyze two similarity measures, Sλ and Sπ, based on refinement graphs. The anti-unification-based similarity, Sλ, assesses similarity by finding the anti-unification of two instances, a description capturing all the information common to these two instances. The property-based similarity, Sπ, is based on disintegrating the instances into a set of properties and then analyzing these property sets. Moreover, these similarity measures are applicable to any representation language for which a refinement graph satisfying the requirements we identify can be defined. Specifically, we present a refinement graph for feature terms, in which several languages of increasing expressiveness can be defined. The similarity measures are empirically evaluated on relational data sets belonging to languages of different expressiveness. © 2011 The Author(s). Support for this work came from the project Next-CBR TIN2009-13692-C03-01 (co-sponsored by EU FEDER funds). Peer reviewed.
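
    As a rough, hedged illustration of the two measures named above (not the paper's definitions, which operate over refinement graphs for feature terms), the sketch below represents instances as flat sets of attribute-value properties: the anti-unification is approximated by the shared properties, Sλ by how much of the two instances their common generalization captures, and Sπ by a set comparison of the disintegrated properties. All names and data are invented for illustration.

        # Toy approximations of anti-unification-based and property-based similarity
        # over flat property sets (the paper's measures work on refinement graphs).
        def anti_unification(a, b):
            """Most specific common generalization, approximated as shared properties."""
            return a & b

        def s_lambda(a, b):
            """How much of the instances' content the common generalization captures."""
            return 2 * len(a & b) / (len(a) + len(b)) if a or b else 1.0

        def s_pi(a, b):
            """Disintegrate instances into properties and compare the sets (Jaccard-style)."""
            return len(a & b) / len(a | b) if a | b else 1.0

        instance_1 = {("body", "tube"), ("surface", "smooth"), ("spikes", "yes")}
        instance_2 = {("body", "tube"), ("surface", "rough"), ("spikes", "yes")}

        print("anti-unification:", anti_unification(instance_1, instance_2))
        print("S_lambda:", round(s_lambda(instance_1, instance_2), 3))
        print("S_pi:", round(s_pi(instance_1, instance_2), 3))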

    The Personalization Paradox: the Conflict between Accurate User Models and Personalized Adaptive Systems

    Get PDF
    Personalized adaptation technology has been adopted in a wide range of digital applications such as health, training and education, e-commerce, and entertainment. Personalization systems typically build a user model, aiming to characterize the user at hand, and then use this model to personalize the interaction. Personalization and user modeling, however, are often intrinsically at odds with each other (a fact sometimes referred to as the personalization paradox). In this paper, we take a closer look at this personalization paradox and identify two ways in which it might manifest: feedback loops and moving targets. To illustrate these issues, we report results in the domain of personalized exergames (videogames for physical exercise) and describe our early steps to address some of the issues raised by the personalization paradox. Comment: arXiv admin note: substantial text overlap with arXiv:2101.1002
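
    As a hedged toy illustration of the "feedback loop" manifestation (not an experiment from the paper), the simulation below assumes a system that only observes the user's response to the activity it chooses to show: an early misestimate in the user model is then never corrected, because the corresponding option is never shown again. All preference values and names are invented.

        # Toy feedback loop: the user model is only updated on what the system shows.
        import random

        random.seed(0)
        true_preference = {"cycling": 0.9, "running": 0.4}   # the user actually prefers cycling
        user_model = {"cycling": 0.1, "running": 0.5}        # early, inaccurate user model

        for _ in range(200):
            # Personalize: greedily show the activity the model believes the user prefers.
            shown = max(user_model, key=user_model.get)
            # Feedback is only observed for the activity that was shown.
            enjoyed = random.random() < true_preference[shown]
            user_model[shown] += 0.05 * (enjoyed - user_model[shown])

        print("learned user model:", {k: round(v, 2) for k, v in user_model.items()})
        # "cycling" keeps its wrong estimate: it is never shown, so it is never corrected.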

    A Closer Look at Invalid Action Masking in Policy Gradient Algorithms

    In recent years, Deep Reinforcement Learning (DRL) algorithms have achieved state-of-the-art performance in many challenging strategy games. Because these games have complicated rules, an action sampled from the full discrete action space will typically be invalid. The usual approach to deal with this problem in policy gradient algorithms is to "mask out" invalid actions and just sample from the set of valid actions. The implications of this process, however, remain under-investigated. In this paper, we show that the standard working mechanism of invalid action masking corresponds to valid policy gradient updates. More interestingly, it works by applying a state-dependent differentiable function during the calculation of the action probability distribution. Additionally, we show its critical importance to the performance of policy gradient algorithms. Specifically, our experiments show that invalid action masking scales well when the space of invalid actions is large, while the common approach of giving negative rewards for invalid actions fails to scale. Finally, we provide further insights by evaluating different action masking regimes, such as removing masking after an agent has been trained using masking. Comment: Preprint. Corrected a major issue of the withdrawn version submitted to NeurIPS 202
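
    A minimal sketch of the masking mechanism the abstract analyzes (details such as networks and advantage estimation are omitted, and the masking constant is an illustrative choice): the logits of invalid actions are replaced by a large negative value before the softmax, so those actions receive essentially zero probability, and the policy-gradient update leaves the masked logits untouched.

        # Logit masking for invalid actions in a policy-gradient update (PyTorch).
        import torch

        logits = torch.tensor([1.0, 2.0, 0.5, -1.0], requires_grad=True)
        valid_mask = torch.tensor([True, False, True, True])  # action 1 is invalid in this state

        # State-dependent masking: invalid logits are pushed to a very negative value.
        masked_logits = torch.where(valid_mask, logits, torch.tensor(-1e8))
        dist = torch.distributions.Categorical(logits=masked_logits)
        print("action probabilities:", dist.probs)   # the invalid action gets ~0 probability

        # Simplified policy-gradient step: gradients flow only through valid actions.
        action = dist.sample()
        advantage = 1.0
        loss = -dist.log_prob(action) * advantage
        loss.backward()
        print("gradient on logits:", logits.grad)    # the masked entry's gradient is exactly 0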